-
We discuss a broad class of difference‐based estimators of the autocovariance function in a semiparametric regression model where the signal is the sum of a smooth function and a stepwise function whose number of jumps and their locations (change points) are unknown, while the errors are stationary and m‐dependent. We establish that the influence of the smooth part of the signal on the bias of our estimators is negligible; this is a general result, as it does not depend on the distribution of the errors. We show that the influence of the unknown smooth function is also negligible in the mean squared error (MSE) of our estimators. Although we assumed Gaussian errors to derive the latter result, our finite sample studies suggest that the class of proposed estimators still shows small MSE when the errors are not Gaussian. Our simulation study also demonstrates that, when the error process is mis‐specified as an AR(1) instead of an m‐dependent process, our proposed method can estimate autocovariances about as well as some methods specifically designed for the AR(1) case, and sometimes even better. We also allow both the number of change points and the magnitude of the largest jump to grow with the sample size n. In this case, we provide conditions on the interplay between the growth rates of these two quantities and the vanishing rate of the modulus of continuity (of the signal's smooth part) that ensure consistency of our autocovariance estimators. As an application, we use our approach to provide a better understanding of the possible autocovariance structure of a time series of global averaged annual temperature anomalies. Finally, the R package dbacf complements this article.
Free, publicly accessible full text available April 29, 2026
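The idea behind a difference‐based autocovariance estimator can be illustrated with a minimal Python sketch. It is not the dbacf package's interface and not necessarily the estimator class studied in the article; it assumes the simplest first‐order difference scheme, in which half the mean squared lag‐h difference of the observations is approximately gamma(0) - gamma(h) when the signal is smooth apart from a few jumps. Under m‐dependence, gamma(h) = 0 for h > m, so lag m + 1 gives an estimate of gamma(0). The function name `diff_autocov` and the simulated MA(1) example are hypothetical choices made for illustration.

```python
import numpy as np


def diff_autocov(y, m):
    """Estimate gamma(0), ..., gamma(m) from first-order differences.

    Illustrative sketch only: assumes stationary m-dependent errors and a
    signal whose lag-h increments are negligible except at a few jumps.
    """
    y = np.asarray(y, dtype=float)
    if m < 0 or y.size <= m + 1:
        raise ValueError("need m >= 0 and more than m + 1 observations")

    def half_msd(h):
        # Half the mean squared lag-h difference, roughly gamma(0) - gamma(h).
        d = y[h:] - y[:-h]
        return 0.5 * np.mean(d ** 2)

    # For h = m + 1 the errors are uncorrelated, so this targets gamma(0).
    gamma0 = half_msd(m + 1)
    return np.array([gamma0] + [gamma0 - half_msd(h) for h in range(1, m + 1)])


# Simulated example (assumed setup): smooth signal plus one jump,
# with MA(1) errors, which are 1-dependent.
rng = np.random.default_rng(0)
n = 20000
x = np.arange(n) / n
signal = np.sin(2 * np.pi * x) + 2.0 * (x > 0.5)
e = rng.standard_normal(n + 1)
errors = e[1:] + 0.5 * e[:-1]  # true gamma(0) = 1.25, gamma(1) = 0.5
estimates = diff_autocov(signal + errors, m=1)
```

In this setup the signal is never estimated or removed explicitly: differencing at short lags nearly cancels the smooth part, and the single jump enters only a handful of the n differences, so its contribution to the average is small.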
-
Abstract: Identification of clusters of co‐expressed genes in transcriptomic data is a difficult task. Most algorithms used for this purpose fall into two broad categories: distance‐based and model‐based approaches. Distance‐based approaches typically use a distance function between pairs of data objects and group similar objects together into clusters. Model‐based approaches rely on the mixture‐modeling framework. Compared to distance‐based approaches, model‐based approaches offer better interpretability because each cluster can be explicitly characterized in terms of the proposed model. However, these models present a particular difficulty: identifying a correct multivariate distribution on which the mixture can be based. In this manuscript, we first review some of the approaches used to select a distribution for the mixture model. Then, we propose avoiding this problem altogether by using a nonparametric MSL (maximum smoothed likelihood) algorithm. This algorithm was proposed earlier in the statistical literature but has not, to the best of our knowledge, been applied to transcriptomics data. The salient feature of this approach is that it avoids explicit specification of the distributions of individual biological samples, thus making the practitioner's task easier. We performed both a simulation study and an application of the proposed algorithm to two different real datasets. When used on a real dataset, the algorithm produces a large number of biologically meaningful clusters and performs at least as well as several other mixture‐based algorithms commonly used for RNA‐seq data clustering. Our results also show that this algorithm is capable of uncovering clustering solutions that may go unnoticed by several other model‐based clustering algorithms. Our code is publicly available on GitHub at https://github.com/Matematikoi/non_parametric_clustering
-
Abstract: TMEM16F is a Ca2+-activated phospholipid scramblase in the TMEM16 family of membrane proteins. Unlike other TMEM16s, which exhibit a membrane-exposed hydrophilic groove that serves as a translocation pathway for lipids, the experimentally determined structures of TMEM16F show the groove in a closed conformation even under conditions of maximal scramblase activity. It is currently unknown whether and how the TMEM16F groove can open for lipid scrambling. Here we describe the analysis of ~400 µs of all-atom molecular dynamics (MD) simulations of TMEM16F that reveal an allosteric mechanism leading to an open-groove, lipid scrambling competent state of the protein. The groove opens into a continuous hydrophilic conduit that is highly similar in structure to that seen in other activated scramblases. The allosteric pathway connects this opening to an observed destabilization of the Ca2+ ion bound at the distal site near the dimer interface and to the dynamics of specific protein regions that produce the open-groove state for phospholipid scrambling.
